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We describe a class of observational strategies for probing the anisotropics in the cosmic mi- 
crowave background (CMB) where the instrument scans on rings which can be combined into an 
n-torus, the ring torus. This class has the remarkable property that it allows exact maximum like- 
lihood power spectrum estimation in of order N 2 operations (if the size of the data set is N) under 
circumstances which would previously have made this analysis intractable: correlated receiver noise, 
arbitrary asymmetric beam shapes and far side lobes, non-uniform distribution of integration time 
on the sky and partial sky coverage. This ease of computation gives us an important theoretical 
tool for understanding the impact of instrumental effects on CMB observables and hence for the 
design and analysis of the CMB observations of the future. There are members of this class which 
closely approximate the MAP and Planck satellite missions. We present a numerical example where 
we apply our ring torus methods to a simulated data set from a CMB mission covering a 20 degree 
patch on the sky to compute the maximum likelihood estimate of the power spectrum Ci with 
unprecedented efficiency. 



I. INTRODUCTION 



A major near-term objective in the field of Cosmology today is to gain a detailed measurement and statistical 
understanding of the anisotropics of the cosmic microwave background (CMB). While the theory of primary CMB 
anisotropy is well-developed (see jlj for a review) and we are facing a veritable flood of data from a new generation 
of instruments and missions, the single most limiting hurdle are the immense computational challenges one has to 
overcome to analyze these data J| Q] . 

We will now outline the key problems involved. The goal is to derive two things: an optimal sky map estimate and 
a set of power spectrum estimates Ci which, for a Gaussian CMB sky are a sufficient statistic, highly informative 
about cosmological parameters. In this paper we will focus on the Ci estimation problem, but the ideas we develop 
can be usefully employed for map-making as well j^]. 

We assume we start from a best estimate of the sky map of the cosmic microwave background fluctuations, the 
monopole and dipole components have been removed, and that residual foregrounds are negligible after excision 
of foreground dominated regions. The analysis problem is then one of spatial covariance estimation and involves 
maximizing the likelihood given the data as a function of the Cg. If the observed CMB anisotropy data is written as 
a vector d the likelihood takes the form 

_ e. P [-^(S(C,) + N)-d] 
^(2»)^|(S(C,) + N)| 

where the signal covariance matrix S is a function of the CV, N is the noise covariance matrix and Nd is the number 
of entries in the data vector d. If S + N is a general matrix, evaluating the inverse and determinant in this quantity 
takes ©(A^J) operations. For currently available data sets Nd ~ 10 5 , which is expected to grow to Nd ~ 10 7 by the 
end of this decade. 

Borrill Q presented a careful review of the computational tasks involved in solving this maximization problem in 
the general case, using state-of-the-art numerical methods. The conclusion is that in their present form these methods 
fail to be practical for forthcoming CMB data sets. Given the major international effort currently underway to observe 
the CMB (see, e.g. [^|-|)), finding a solution to this problem is of paramount importance. 

The underlying reason for this failure is that the signal covariance matrix S and noise covariance matrix N are 
usually very differently structured. The data d are usually represented by a vector containing the temperatures of 
pixels in a sky map and Nd = N P i X , the number of observed pixels. Hence entries in S and N are the covariances 
between pairs of pixels on the celestial sphere. In pixel space, S is full, while N is ideally sparse (for a well-controlled 
experiment). In spherical harmonic space we are in the opposite regime. If the CMB sky is isotropic S would be 
diagonal for all-sky observations, while the form of N depends on experimental details and observational strategy 
of the mission and in general could be any covariance matrix at all. Therefore the quantity which enters in the 
likelihood C = S + N is not sparse in any easily accessible basis. Other basis sets have been suggested, such as cut 
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sky harmonics and signal to noise eigenmodes |lO|]Tl| ], but changing into these more general bases takes O^N^^j 
operation. 

There have been several attempts to find approximate formulae for the likelihood. Several methods are based on 
replacing the likelihood conditioned on the data vector with an approximate likelihood based on the pseudo-CV (the 
power spectrum computed on the incomplete sky, possibly weighted by an apodizing window) [T^Jl^ | . These methods 

are very fast, scaling as OLN^f^J , unbiased, but sub-optimal to a degree which depends on the size of the survey 

area. A similar approach is to first compute an unbiased estimate of the correlation function in pixel space from the 
raw data and then transform to obtain the power spectrum in 0{N^ ix ^j operations [l4| ]. A hierarchical approximation 
of the likelihood reduces the computational scaling to 0(Np ix ) . Other authors have found minimum variance 
quadratic estimators jl6 17 but the calculations still require 0(A^ X ) operations. 

There is only one known class of observational strategies in which the (numerically) exact maximum likelihood 
solution can be computed faster, namely in 0(Np ix ) operations [jTsfl . This performance depends on a number of 
assumptions: azimuthally symmetric coverage of an azimuthally symmetric portion of the sky with an circularly 
symmetric beam and uncorrelated noise^j. For ease of reference we will call this class azimuthal strategies. 

In this paper we discuss a second class of observational strategies, introduced in j2(J, for which the maximum 
likelihood problem can be solved exactly^. This class is interesting in that it is the the first to admit fast maximum 
likelihood estimators which can analyze CMB data in the presence of correlated noise, non-uniform distribution of 
integration time, beam distortions, far side lobes, and particular types of partial sky coverage, all without needing to 
invoke approximations. Every forthcoming CMB datasets will be affected by some subset of these issues. 

Further interest for observational strategies in this class derives from the fact that they are natural choices from a 
practical point of view. Evidence for this is that current and future balloon and space missions have chosen strategies 
which are based on scanning on interlocked rings: Archeops scanned on rings, MAP's precessing scan generates a 
3-torus, and the simplest of the proposed scanning strategies for Planck generates a 2-torus. 

The key point is that for this class of observational strategies modeling observations in the time ordered domain 
simplifies the problem greatly. In section || we argue that several instrumental effects are modelled more easily in the 
time ordered data (TOD). To be able to formulate the likelihood problem in terms of the TOD one needs to be able to 
compute the expected signal covariance between two samples in the TOD. We achieve this using the Wandelt-Gorski 
method for fast convolution on the sphere p2| ] which we briefly review in section III . 

Using the results from these geometrical ideas we go on to formulate the likelihood problem for the Ci on the ring 
torus in section IV. We show that the correlation matrices have a special structure for a class of scanning strategies 
namely those where the TOD can be thought of as being wrapped on an n-torus, the ring torus. It turns out that 
both S and N are sparse. In fact, they are both block-diagonal with identical patterns of blocks; hence C is also block 
diagonal. The exact computational scaling of the evaluation of C{Cf) depends on the scanning strategy used, but if 
the ring torus is an 2-torus it takes only 0[N^\ operations. 

We then present a numerical example where er apply this method to simulated data in section Finally, we discuss 



the application of our ideas to real world missions and conclude with suggestions for future directions in section VI 



II. THE CASE FOR ANALYZING CMB DATA IN THE TIME-ORDERED DOMAIN 



In this section we will discuss features of realistic instruments and observational strategies and show that they are 
more easily modeled in the time ordered domain. 



1 This class contains the "ideal CMB experiment" with uniform full sky coverage, uncorrelated noise, and symmetric beam 
as a special case. For pixelisation schemes admitting fast transforms, such as HEALPix |l9| this ideal case can be solved in 

o(Np{^ operations. 

2 The analysis of CMB data on a one-dimensional data set consisting of a single ring was first discussed in |2l| . The generali- 
sation of this case to correlated received noise and arbitrary beam patterns is contained as a special case in the class we discuss 
here. 
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A. Correlated noise 

Current detectors have the feature that they add correlated noise to the signal. It appears to be a good approxi- 
mation to assume that this noise is stationary and circulant. If this is the case then it has a very simple correlation 
structure in Fourier space. But estimating the sky from the TOD projects these simple correlations into very compli- 
cated noise correlations between separated pixels, which are visible as striping in the maps. 

B. Non-uniform distribution of integration time 

For most missions, practical constraints on the scanning strategy force the available integration time to be allotted 
non-uniformly over the sky. This leads to coupling of the noise in spherical harmonic space, in addition to coupling 
due to partial sky coverage (see paragraph D below). However, in the TOD, integration time per sample is constant, 
by definition. 

C. Beam distortions and far side lobes 

Microwaves are macroscopic, with wavelengths of order 10~ 3 — 10~ 2 cm. Hence they will diffract around macro- 
scopic objects. This leads to distorted point spread functions and far side lobes. Simulations have to convolve these 
asymmetric beams with an input sky with foreground signals varying over many orders of magnitude from place to 
place. Motions of the instrument while a sample is taken leads to anisotropic smearing of the beam. Analyses have 
to deconvolve the observations to make accurate inferences about the underlying sky. Artifacts due to asymmetric 
beams and beam smearing affect the correlations of the signal in a raw, coadded sky map in a complicated way which 
depends on the scanning strategy of the instrument. If the beam is short in a particular direction, the data contains 
information on this short scale. Deconvolution can in principle reveal additional structure. 

D. Partial sky coverage 

The cosmological information is most visible in spherical harmonic space. In a homogeneous and isotropic Universe, 
where the perturbations are Gaussian, the power spectrum Ci completely encodes the statistical information which 
is present in a map of fluctuations. Statistical isotropy of the fluctuations around us also implies that the spherical 
harmonics Yg m are the natural basis for representing the CMB sky. But due to astrophysical foregrounds that need 
to be excised from the map or partial sky coverage by the instrument this isotropy is broken in actual data sets. This 
introduces additional signal and noise correlations in Yg m space, because there is a geometrical coupling between Yg m 
of different I and to. 



E. Pixelization effects 

Representing the data as a pixelised map on the sky is an invaluable and time-proven tool for visual analysis. 
However, it is not without possible disadvantages either. Binning samples into discrete sky pixels introduces an extra 
level of smoothing on the pixel scale and erases information, for example the exact distribution of sampling locations. 
Too large pixels will degrade the signal and erase information about the beam shape (cf issue C). If the distribution 
of samples on the sky is non-uniform choosing pixels too small will result in many missing pixels and hence irregularly 
shaped observation regions. 

There are currently no exact and fast methods that can deal with all of these issues. Yet, every forthcoming CMB 
dataset will be affected by some subset of these issues. 

It is clear from A and B that the noise properties arc much simpler in the TOD than in a sky map. But what 
about the signal correlations? What is the price of forfeiting the advantages of the sphere for representing statistically 
isotropic signals? In the next section we will demonstrate that one can devise scanning strategies which break only 
part of the symmetries of the sphere and make the signal correlations in the TOD as simple as the noise properties. 
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III. MODELING THE SIGNAL IN THE TIME-ORDERED DOMAIN 



Recently, Wandelt and Gorski |B2fl have introduced new methods for greatly speeding up convolutions of arbitrary 
functions on the sphere. This reference contains a detailed description of methods which will enable future missions 
such as MAP and Planck to take beam imperfections into account without resorting to approximations. The algorithm 
is completely general and can be applied to any kind of directional data, even tensor fields Q. We will now briefly 
review the ideas and quote the results. 

Let us define more precisely what is meant by "convolution on the sphere" . Mathematically, the convolution of 
two functions on the sphere is a function over the group of rotations SO (3), because one has to keep track of the full 
relative orientation of the beam to the sky, not just the direction it points in. For each possible relative orientation 
(element of SO (3)) the convolution returns one value. To specify the relative orientation one can use the Euler angles 
$1, O and $20. The convolved signal for each beam orientation (<£>i, O, $2) can then be written as 

T s ($ 2 , 0, $i) = J dfy [d($ 2 , O, $!)&](7)*s(7)- (2) 

Here the integration is over all solid angles, D is the operator of finite rotations such that Db is the rotated beam, 
and the asterisk denotes complex conjugation. At each point on the sphere, there is a ring of different convolution 
results corresponding to all relative orientations about this direction. This is a direct consequence of the fact that the 
beam is not assumed to be azimuthally symmetric. 

If L measures the larger of the inverse of the smallest length scale of the sky or beam, the numerical evaluation of the 
integral in Eq. (^) takes 0(L 2 ) operations for each tuple (<E>i, O, $2)- To allow subsequent interpolation at arbitrary 
locations it is sufficient to discretize each Euler angle into 0(L) points and thus we have C(L 3 ) combinations of 
them. As a result, the total computational cost for evaluating the convolution using Eq. @ scales as 0(L 5 ) . 

It turns out that by Fourier transforming the above equation on the Euler angles and judiciously factorising the 
rotation operator D($2, ©, $1) as described in |Q we can reduce this scaling to C(L 4 ) in general. One can further 
reduce the scaling to 0(L 3 ) if one is only interested in the convolution over a path which consists of a ring of rings 
at equal latitude around the sphere. We called such configurations basic scan paths in references p^ , pO| . 

We quote here the formula for the Fourier coefficients of the convolved signal on this ring torus using the notation 
of p^j. In order to do so we need to make use of the angular momentum representation of the rotation operator D. 
A simple explicit expression for the matrix elements D l mm ,((j)2,0,(f>i) can be given. One can define a real function 
such that 

D l mm ,{^e,4> x ) = e~^d l mm ,(0)e- m '^ (3) 

Thus the dependence of D on the Euler angles <pi and 02 is only in terms of complex exponentials. While explicit for- 
mulas for the d- functions exist [ p4| , they are more conveniently computed numerically using their recursion properties 
0- 

If the convolved signal on the ring torus is T^ p with r counting the ring number and p the pixel number on the ring, 
then its Fourier coefficients are 

T mm' =y^,Wmd e mm >(dE)Xe m >. (4) 
i 

Here the si m are the spherical harmonic multipoles for the CMB sky and 9e fixes the latitude of the spin axis on the 
sky. The quantity 

Xl m = ^2 diiM^WlM (5) 
M 

is just the rotation of the beam multipoles bg m for an arbitrary beam by 9, the opening angle of the scan circle. X^ m 
can be precomputed. Counting indices in Eq. (||) confirms that computing the convolution along a basic scan path 
takes only C(L 3 ) operations. 



3 Our Euler angle convention refers to active right handed rotations of a physical body in a fixed coordinate system. The 
coordinate axes stay in place under all rotations and the object rotates around the z, y and z axes by $1, O and $2, respectively, 
according to the right handed screw rule. 
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FIG. 1. The ring torus in three different representations. Figure A shows the scan rings on the sphere for one particular 
scanning strategy. We see the projection of a disc 20° across, covered with rings of radius 9 — 5° whose centers are 9e = 5° 
away from the north pole. The angles (p and <f>E parametrize the position on each ring and the ring location, respectively. 
Figure B shows the 2 dimensional ring torus generated by this scanning strategy and in Figure C the rings are unfolded into a 
2 dimensional plane like in Figures (pi) and m). 



Using the results from the previous section we can now write down a statistical model in terms of the likelihood of 
the Ci and experimental parameters given the time-ordered data. 

We will first derive the signal-correlation matrix as a function of the Ci and then go on to derive the noise correlation 
as a function of instrumental parameters (like the shape of the noise power spectrum in the TOD) and the scanning 
strategy (in this case the dimensions and position of the basic scan path). 



What we are interested in are the correlation properties of the signal on the ring torus, Eq. (Q). We can now easily 
compute the correlation matrix 



IV. LIKELIHOOD ANALYSIS ON THE RING TORUS 



A. Signal Correlations 
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(6) 
(7) 



where <5y denotes the Kronecker delta, defined as 



1 if i = j 



~ \ if i + j ■ 

The boxed Kronecker delta shows that the correlation matrix in Fourier space on the ring torus is block diagonal. 



(8) 



B. Noise Correlations 

We start by assuming that we can model the noise in the time ordered domain as a stationary Gaussian process 
whose Fourier components Tj? have the property 

T»T$*)=P{k)S kk ,. (9) 

In the following we will allow for the complication that the scanning strategy involves integrating repeatedly on a 
fixed ring then the signal remains the same for each scan of that ring. One can just co-add all scans over the same 
ring without losing information. 

To be specific, if we define Nr to be the number of pixels per ring, N c the number of times the satellite spins while 
observing a fixed ring, and N r to be the number of rings for a one-year mission, we can write Ntod = NrN c N t for 
the number of samples in the TOD. In the following we will choose N r = Nr for simplicity, though this is not required 
by the formalism and can be generalised trivially. 

To model the noise on the coadded rings, we define T pr to be the noise temperature in the pth pixel in the rth ring, 
such that 

N-l 

Tpr = ^ ^ Ap rn T n . (10) 

71 = 

Here A pr n is the co-adding matrix, defined in Eq. (|Al|) . It encodes the scanning strategy by associating a given pixel 
in the ring set with indices r and p with a set of samples in the TOD. 

Then we can show (see Appendix) that the covariance matrix of the Fourier components of the co-added noise on 
the rings is 

rrN — I fpN r^N* \ 

1 m'mM'M — \ 1 m'm 1 M'M J V 11 ) 

N r -l N r -1 

1 I x - ^ZE±( m ' p ^ M 'p') 



(N r f 



>mM 



p,p'=0 A=0 



where C(A,p — p') is defined in Eq. (A2). The boxed Kronecker-<5 is telling us that the noise covariance is block 
diagonal in the ring Fourier basis. 



C. Generalization to more complicated scanning strategies 

We just showed that by choosing the scanning strategy such that the TOD can be wrapped (co-added) onto a 
2-torus one can find a basis in which both noise and signal are block diagonal and hence sparse. What is more, the 
data can be easily transformed into this basis just by computing the Fast Fourier Transform (FFT) of the co-added 
TOD. 

The results Eq. ^ and Eq. ( p~T| ) warrant some comments. We can understand them in terms of symmetries. The 
ring torus exhibits a spatio-temporal symmetry: time proceeds linearly around the torus, in the same direction as the 
azimuthal angle 4>e (cf. Fig. (|l|)). Wrapping the TOD on the torus breaks the translation invariance of the noise in the 
TOD. Similarly, projecting the signal from the sphere onto the torus breaks the isotropy of the signal on the sphere. 
But these original symmetries manifest themselves as ring-stationarity of the noise (correlations between rings only 
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depend on the distance between them) and ring-stationarity/ "partial isotropy" of the signal (only in the azimuthal 
(f>E direction on the torus). These in turn express themselves as a simplified correlation structure in Fourier space. 

We can therefore see that the key feature which leads to the simplicity of the correlation matrices is the repetitive 
structure of the scanning in the <f>E direction. Therefore it is plain that these methods can be generalised to ring tori 
of arbitrary cross section, for example where the scanning rings are ellipses, without losing the computational scaling 
performance, albeit with more cumbersome formulae which we will not produce here in detail. This would also enable 
us to remove regions where foregrounds dominate if that involved removing the same part of every ring in the torus, 
because this does not spoil the ring-stationarity. 

From following the discussion in we can easily see how the results Eq. ([?]) and Eq. (^lj) generalize to precessing 
scans. For example rewriting Eq. (8) in |2^] as 

T m m> rn" = S£ m C?f„ m , {9 E )d l m , m ,, (Op)X £ * n „ . (12) 

e 

and substituting for Xi m >> from Eq. (|^) we obtain the signal part of the TOD for a precessing experiment where the spin 
axis moves around the sphere at co-latitude 6e and precesses about it with amplitude Op. We see that this TOD can 
be wrapped on a 3-torus. Computing the signal correlation matrix in Fourier space on this 3-torus T^ m , „ M M , M ,, 
we find that it is again block diagonal. However, the size of blocks has now increased. The computational performance 
in this case depends on the details of the scanning strategy, but in general will lead to a prefactor which increases 
with the precession angle 9p. 

For completeness wc mention that by introducing further factors of the rotation operator this formalism can be 
generalised to n-tori with n > 3, but this would appear to give rise to impractical scanning strategies. 



D. Constructing the likelihood for ring torus power spectrum estimation 



One can write down the likelihood Eq. ([!]) on the ring torus by simply substituting 

^map * ^ring torus? S ► T , N > T . (13) 

Due to the block diagonality, the inverse and determinant in the Eq. (|l|) only take 0(iV 2 ) operations to evaluate. 
The gradient is similarly easy to evaluate 

- 2^ = TriC-'CcJ - cFC-'Cc^d (14) 

which only takes C(L 3 ~ TV 3 / 2 ) operations to compute for each component. 



E. Confidence Regions 



To determine confidence regions for the estimates one could explore the likelihood surface around the maximum. 
If the number of estimated band powers is Nf,, using this recipe to evaluate the diagonal elements of the covariance 
matrix of the estimates would scale as O^N^N 2 } , and the full covariance (D(N 2 N 2 ^) . This is more like ©(iV 5 / 2 ) 
or 0(7V 3 ) depending on the coarseness of the binning. 

But one can improve on this, because evaluating the gradient of the likelihood also takes just 0(N 2 } operations 
on the ring torus. So one can numerically differentiate the gradient with respect to the i-th band power estimate, 
to obtain the i-th row of the covariance matrix. The full covariance matrix can then be computed in 0(NbN 2 ) 
operations. That's at most C(iV 5 / 2 ) but given that the Ce are smooth in most models Nt < 100 may suffice, 
especially for less than full sky coverage, where adjacent Cg estimates become strongly coupled. 



V. NUMERICAL EXAMPLE: POWER SPECTRUM ESTIMATION 



To test the method, we used a standard CDM power spectrum and simulated a realization of spherical harmonic 
multipoles a £rn up to L = 1024. We used the Wandelt-Gorski method for fast convolution to generate N r = 243 
scanning rings with 5° radius Nr — 243 samples each. The spin axis was offset from the north pole by 5°. This 
scanning pattern is shown in Fig. (0A). The data covers a polar cap on the sphere of diameter 20 degrees. For 
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simplicity we used a symmetric Gaussian beam with full width at half maximum of 18' - however this simplification 
is not forced by the method which allows using any beam pattern. 



000 
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FIG. 2. The noise model in the time ordered data. The vertical line shows the knee frequency. The frequency k is shown in 
units of the Nyquist frequency of the time stream Ntod /2. 



For the noise power spectrum we use a simple model motivated by measurements from realistic receivers - but 
again any power spectrum could be used here. We adopot the functional form used by p6| , p7| 

P(k) = 2o 2 [l + {k knee /k) a ], 0<a<2, (15) 

where the first term corresponds to white noise with variance per pixel of a 2 and the second to l//-type noise. We 
show this power spectrum in Fig. (||). For our simulations we regularize this form at small frequency by introducing 
a small additive constant corresponding to a tenth of the smallest representable frequency in the denominator of the 
1// term. Hence our noise simulation is not constrained to have zero mean but contains a random offset. 

We generated a realisation of noise with a = 1, a = 9/iK and kknee = 1. X 10~ 3 (expressed in terms of the Nyquist 
frequency Ntod/2) and added it to the ring set. The value of a was chosen to give S/N ~ 1 at £ = 600 for our 
simulations. We have N r = 243 rings with Nr — 243 pixels in each ring. For simplicity of implementation each ring 
was scanned a single time, N c = 1, giving a total number of Ntod = 59049 pixels, a similar number of pixels to the 
BOOMERAnG experiment. 

In Fig. (|^) we show the ringset with and in Fig. (^) without noise using the representation shown in Fig. (0C). On 
the plots, a line from the top to the bottom describes one ring, in this way the rings are put next to each other from 
left ro right. The line from left to right in the middle, with the same value, is the north pole where all rings meet. 
We see clearly the stripes from the correlated noise. 

Instead of estimating Ct, we estimate Di given by, 



D, = 



C e £(l + 1) 



e -<T*i(t+l) ' 

where a h is for a 18' FWHM beam. We divide D e in N b = 20 flat bins D X ,D 2 , D 20 with 50 multipoles in each 



(16) 
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FIG. 4. As Fig. (fel) but with correlated noise added, causing the vertical striping. 



The Fourier transformed signal correlation matrix can be calculated in two ways. One might evaluate it directly in 
Fourier space with equation (0), or one could first evaluate it in pixel space and then Fourier transform it using FFT. 
In our test example we have chosen the latter. The formula for the correlation matrix in pixel space is in our case 



where 89 is the angular distance between the two points pr and p'r' on the sphere and the sum over £ is taken over 
all £ values in the particular bin b. All 59 angles arc precomputed. The derivative with respect to a certain bin is 
then simply, 



In Figure (g) we have plotted the result from likelihood estimation. To find this estimate we initialised the iterative 
conjugate gradient scheme psj ] with a flat power spectrum Cg =constant. The minimum of the likelihood was found 
after about 30 function and derivative evaluation taking about 2 hours each on a single processor on a 500 MHz 
DEC Alpha Work Station. If we had started the likelihood minimisation with a better inital guess for the power 
spectrum (which could be obtained using one of the N 3 / 2 methods 0,^3) ) the number of likelihood evaluations could 
be significantly reduced. 




(17) 




(18) 
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FIG. 5. The result from a likelihood estimation of the power spectrum from a time stream with 243 rings with 243 pixel per 
ring. The error bars approximate asymmetric 2a confidence regions by showing where the loglikelihood drops by less than 2 
compared to the maximum. Three points at high i fail to detect significant power at 95% confidence and return upper limits 
on Ce. 

The solid line shows the input average power spectrum used and the crosses are the estimates with error bars. 
The errorbars approximate a 95% confidence region determined by finding the interval of the estimate where the 
log-likelihood is reduced by less than 2 compared to its maximum value. We compared these confidence regions 
with the errors bars determined from the observed Fisher matrix by finding the numerical second derivative of the 
loglikelihood. The square root of the diagonal components of its inverse yields errorbars which were consistent with 
the ones we obtained from the likelihood contours. 

Note that we did not subtract lower order modes from the sky patch that we simulated (though we assumed that the 
map was free of non-cosmological monopole and dipole modes). As we mentioned, the noise was also not constrained 
to average to zero. The presence of lower order modes was correctly modelled and handled by the algorithm. 



VI. CRITICAL DISCUSSION AND CONCLUSIONS 



In this paper we described the properties of the class of observational stratgies where the instrument scans on rings 
which can be arranged into an n-torus. Using these properties we constructed for the first time an exact formulation 
of the maximum likelihood problem of power spectrum analysis on the sphere which can deal with correlated noise, 
realistic beams, partial sky coverage, and non-uniform noise with a computational scaling of (D(A^ 2 ) . This is a 
significant advance over other available exact methods with the same scaling, all of which assume white noise and 
azimuthally symmetric beams, amongst other assumptions. 

How realistically can we expect these methods to be useful for real CMB missions? Real observations obviously 
never coincide exactly with any simple mathematical model. This was also true for the previously known solvable class 
of strategies of azimuthal sky coverage, symmetric beams and uncorrelated noise. But the fact that the likelihood 
simplifies in this idealised situation, defined one of the design goals for the MAP satellite, namely to ensure that the 
noise be uncorrelated between samples. This allowed a fast maximum likelihood power spectrum estimation method 
to be developed for MAP |Q which achieves success by employing iterative numerical schemes taking advantage 
of the fact that the data nearly satisfy the assumptions of the exactly solvable case and thus improve convergence. 
Perturbation theory is powerful but only if exact solutions are known that are close to the real thing. 
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In our case, even though the methods we describe can deal with many feat ures o f actual CMB missions, there 



remain discrepancies. We have not shown (except for our discussion in section fYC ) how to deal with small areas 
where foregrounds dominate and defy removal, such as bright dust emission from the galactic plane, or certain point 
sources. Also, irregularities in the scanning strategy will somewhat spoil the symmetry of spatial correlations in the 
signal and temporal correlations in the noise which the ring torus exploits. 

We therefore advocate a similar approach to the one which was used successfully in flq] , namely to use exact 
inverses, computed using the methods we presented in this paper as preconditioners to speed up conjugate gradient 
methods for computing C~ l d for more general cases, e.g. galactic cuts, point sources, imperfect scanning and non- 
stationarity of the noise. Other nuisance factors, such as discrete events like cosmic ray hits and short telemetry losses 
can plausibly be dealt with in established ways, namely by filling in the data stream with a constrained realisation 
based on the properties of the observed data. 

Even neglecting the computational advantages of ring torus methods there are other advantages to analysing time 
ordered data directly. First, this approach does not rely on map-making on the sphere for the purposes of power 
spectrum estimation. Not having to base Ci estimation on maps helps to avoid systematic effects or signal degradation 
introduced by previous analysis steps, for example a heuristic de-striping tool. On the ring torus, one fully models 
the striping due to correlated noise and hence does not rely on heuristic methods. 

While the sky map may not necessarily be the best starting point for the purposes of Ci estimation, one should of 
course make the best map one possibly can. Maps are indispensable for the visualization of the data and identifying 
systematic errors. Also, recently developed map based methods for estimating the noise properties of the observations 
from the data itself p9,B0| can be used to provide the necessary noise model input to our methods (such as P(k), Eq. 
©)• 

In fact the geometrical ideas we discuss in this paper may well be useful for map making itself. Reference |22] ] 
shows how the ring torus speeds up iterative methods for deconvolving a map with asymmetric beams. The block 
diagonal noise correlations on the torus mean that one can use the same geometrical ideas for solving the map making 
equations in the presence of noise. In fact both signal and noise covariances are sparse in this basis, hence this method 
can be used to quickly Wiener filter the data on the ring torus once the Ct are known to optimally estimate the CMB 
fluctuations on the torus. These can then be visualized by projecting them onto a sky map. 

Under the stated assumptions we have shown that the power spectrum analysis problem simplifies greatly when it 
is considered in the domain of co-added time-ordered data which has the geometry of a torus. We indicate how to 
derive extensions of the formulae we presented here to more general scanning strategies (e.g. a precessing scan, such 
as MAP's or another proposal for Planck). In this case the efficiency depends on the size of the extra dimensions 
which the torus acquires. 

However, even as they stand, the methods and ideas we describe represent a new theoretical tool which allows 
evaluating the impact of CMB mission design choices in much more detail than previously feasible. In order to 
achieve the best possible estimates of the CMB power spectrum, or certain cosmological parameters, one can ask 
whether it is more important to reduce the amount of correlated noise from the receiver, to ensure accurate pointing, 
to increase sky coverage, change the survey geometry, or to control the beam shape and far side lobes? We are 
investigating quantitative answers to these questions. 
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APPENDIX A: THE NOISE COVARIANCE ON THE RING TORUS 

The co-adding matrix can be written as 

Apm = -pj-S p ( n mod iV r )^ r ^ _a_ j) = J^-5 rx 5 pz (Al) 

Here [/J is defined to be the largest integer smaller than /. 

The previous equations simplify if we write n as n — N r N c x + N r y + z with x G [0,N r — 1], y G [0, iV c — 1], 
z G [0, N r — 1]. Now using that 
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P(k), 



we find that the noise covariance on the ring set is 



N-l 



N a -1 



C(r-r',p-p')^(T r N p T r %*) = 1 ^Y, P W E e~^ N '- N '^ +N ^^ + ^'^ . (A2) 

c k=o y,y'=0 

Note that C is periodic with period N r in its first entry; letting A = r — r' we have 

C(A, P -p') = C(A+jN r , P -p% jeZ. (A3) 

Finally we Fourier transform to arrive at the covariance of the Fourier components of the the co-added noise on the 
rings 



m'mM'M 



rpN rpN* 

1 m'm 1 M'M 
1 



(A4) 



(N r f 



N r -l 

E *- 

p,p'—0 



p- (m'p—M'p') 



Nt—1 



E e-^ mA C(A,p-p') 



A=0 



where we used (A3) to simplify the sums over r,r' and to obtain the boxed Kronecker-5 telling us that the noise 
covariance is block diagonal in the ring Fourier basis. 

For practical purposes one would first compute the Fourier transfrom in (A2) using an FFT taking ~ NlogN oper- 
ations. Then the resulting array can be interpolated at N r (2N r — 1) positions to obtain the independent components 
of C. Finally we have a three-dimensional FFT left to evaluate ([111), which requires \ogN r operations. 
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